Dataset statistics
| Number of variables | 15 |
|---|---|
| Number of observations | 1253 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 285.8 KiB |
| Average record size in memory | 233.6 B |
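The average record size is consistent with the two figures above it. A minimal sketch of that arithmetic, assuming the report simply divides the total size in memory by the number of observations (the variable names are illustrative and not part of the report):

```python
# Figures copied from the "Dataset statistics" table.
KIB = 1024  # bytes per KiB
total_size_kib = 285.8
n_observations = 1253

# Assumed relationship: average record size = total bytes / number of rows.
avg_record_bytes = total_size_kib * KIB / n_observations
print(f"{avg_record_bytes:.1f} B")
```

The result agrees with the reported value to one decimal place.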
Variable types
| Categorical | 2 |
|---|---|
| Numeric | 13 |
FECHA_DEF has a high cardinality: 629 distinct values | High cardinality |
NEUMONIA is highly correlated with EDAD and 11 other fields | High correlation |
EDAD is highly correlated with NEUMONIA and 11 other fields | High correlation |
DIABETES is highly correlated with NEUMONIA and 11 other fields | High correlation |
EPOC is highly correlated with NEUMONIA and 11 other fields | High correlation |
ASMA is highly correlated with NEUMONIA and 11 other fields | High correlation |
INMUSUPR is highly correlated with NEUMONIA and 11 other fields | High correlation |
HIPERTENSION is highly correlated with NEUMONIA and 11 other fields | High correlation |
OTRA_COM is highly correlated with NEUMONIA and 11 other fields | High correlation |
CARDIOVASCULAR is highly correlated with NEUMONIA and 11 other fields | High correlation |
OBESIDAD is highly correlated with NEUMONIA and 11 other fields | High correlation |
RENAL_CRONICA is highly correlated with NEUMONIA and 11 other fields | High correlation |
TABAQUISMO is highly correlated with NEUMONIA and 11 other fields | High correlation |
CLASIFICACION_FINAL is highly correlated with NEUMONIA and 11 other fields | High correlation |
SEXO is highly correlated with CLASIFICACION_FINAL | High correlation |
CLASIFICACION_FINAL is highly correlated with SEXO and 12 other fields | High correlation |
FECHA_DEF is uniformly distributed | Uniform |
Reproduction
| Analysis started | 2021-12-10 20:38:07.904319 |
|---|---|
| Analysis finished | 2021-12-10 20:38:45.834455 |
| Duration | 37.93 seconds |
| Software version | pandas-profiling v3.1.0 |
| Download configuration | config.json |
FECHA_DEF
Categorical
| Distinct | 629 |
|---|---|
| Distinct (%) | 50.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 82.1 KiB |
| 2021-09-30 | 2 |
|---|---|
| 2021-09-24 | 2 |
| 2021-10-27 | 2 |
| 2020-04-16 | 2 |
| 2020-11-07 | 2 |
| Other values (624) |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 5 |
|---|---|
| Unique (%) | 0.4% |
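The report lists both a "Distinct" and a "Unique" count for this variable. As a rough illustration, assuming "distinct" means the number of different values and "unique" means the number of values that occur exactly once, the two can be computed as follows. The helper name `distinct_and_unique` is hypothetical, and the sample is the five rows listed under Sample.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# The five sample rows shown for this variable.
sample = ["2020-03-18", "2020-03-18", "2020-03-20", "2020-03-22", "2020-03-23"]
distinct, unique = distinct_and_unique(sample)
print(distinct, unique)

# Consistency check on the reported totals: if the 5 unique dates occur once
# and every other distinct date occurs twice, the rows add up to 1253.
rows_implied = 2 * (629 - 5) + 5
print(rows_implied)
```

Under that reading, the reported 629 distinct and 5 unique values are consistent with the 1253 observations, since no date in the frequency tables occurs more than twice.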
Sample
| 1st row | 2020-03-18 |
|---|---|
| 2nd row | 2020-03-18 |
| 3rd row | 2020-03-20 |
| 4th row | 2020-03-22 |
| 5th row | 2020-03-23 |
Common Values
| Value | Count | Frequency (%) |
| 2021-09-30 | 2 | 0.2% |
| 2021-09-24 | 2 | 0.2% |
| 2021-10-27 | 2 | 0.2% |
| 2020-04-16 | 2 | 0.2% |
| 2020-11-07 | 2 | 0.2% |
| 2020-06-10 | 2 | 0.2% |
| 2021-07-25 | 2 | 0.2% |
| 2021-06-30 | 2 | 0.2% |
| 2021-09-04 | 2 | 0.2% |
| 2021-04-23 | 2 | 0.2% |
| Other values (619) | 1233 |
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| 2021-09-30 | 2 | 0.2% |
| 2021-01-03 | 2 | 0.2% |
| 2021-02-24 | 2 | 0.2% |
| 2021-07-15 | 2 | 0.2% |
| 2021-01-29 | 2 | 0.2% |
| 2021-01-11 | 2 | 0.2% |
| 2021-06-08 | 2 | 0.2% |
| 2021-07-27 | 2 | 0.2% |
| 2020-11-19 | 2 | 0.2% |
| 2021-06-26 | 2 | 0.2% |
| Other values (619) | 1233 |
SEXO
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 76.6 KiB |
| Hombre | 628 |
|---|---|
| Mujer | 625 |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 5.501197127 |
| Min length | 5 |
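The fractional mean length can be reproduced from the category counts. A small sketch, assuming the mean is the count-weighted average of the string lengths (counts taken from the Common Values table for this variable):

```python
# Category counts as reported under "Common Values".
counts = {"Hombre": 628, "Mujer": 625}

total = sum(counts.values())
# Weighted average of len("Hombre") == 6 and len("Mujer") == 5.
mean_length = sum(len(value) * n for value, n in counts.items()) / total
print(total, round(mean_length, 9))
```

Because the two categories are almost evenly split, the mean sits just above 5.5, while the median length is 6.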
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Hombre |
|---|---|
| 2nd row | Mujer |
| 3rd row | Hombre |
| 4th row | Hombre |
| 5th row | Mujer |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Hombre | 628 | 50.1% |
| Mujer | 625 | 49.9% |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
|---|---|---|
| hombre | 628 | 50.1% |
| mujer | 625 | 49.9% |
NEUMONIA
Real number (ℝ≥0)
| Distinct | 506 |
|---|---|
| Distinct (%) | 40.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 258.9209896 |
| Minimum | 1 |
|---|---|
| Maximum | 812 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 9.9 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 45.6 |
| Q1 | 120 |
| median | 249 |
| Q3 | 360 |
| 95-th percentile | 537 |
| Maximum | 812 |
| Range | 811 |
| Interquartile range (IQR) | 240 |
Descriptive statistics
| Standard deviation | 161.6604268 |
|---|---|
| Coefficient of variation (CV) | 0.6243619995 |
| Kurtosis | 0.09545303296 |
| Mean | 258.9209896 |
| Median Absolute Deviation (MAD) | 120 |
| Skewness | 0.6353828691 |
| Sum | 324428 |
| Variance | 26134.09359 |
| Monotonicity | Not monotonic |
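Several of the figures in the quantile and descriptive tables above can be derived from others in the same tables. The sketch below recomputes them for this variable using the standard definitions; it is a consistency check on the reported numbers, not a description of how the profiling library computes them internally.

```python
# Values copied from the quantile and descriptive statistics tables above.
minimum, maximum = 1, 812
q1, q3 = 120, 360
std = 161.6604268
total, n = 324428, 1253

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
mean = total / n                  # Mean = Sum / number of observations
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance is the squared standard deviation

print(value_range, iqr, round(mean, 4), round(cv, 4), round(variance, 2))
```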
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 82 | 9 | 0.7% |
| 68 | 8 | 0.6% |
| 282 | 8 | 0.6% |
| 230 | 7 | 0.6% |
| 253 | 7 | 0.6% |
| 152 | 7 | 0.6% |
| 74 | 7 | 0.6% |
| 85 | 7 | 0.6% |
| 72 | 7 | 0.6% |
| 169 | 7 | 0.6% |
| Other values (496) | 1179 |
| Value | Count | Frequency (%) |
| 1 | 4 | |
| 2 | 5 | |
| 3 | 2 | 0.2% |
| 4 | 1 | 0.1% |
| 5 | 1 | 0.1% |
| 6 | 1 | 0.1% |
| 7 | 3 | |
| 8 | 1 | 0.1% |
| 9 | 2 | 0.2% |
| 13 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 812 | 1 | |
| 795 | 1 | |
| 784 | 1 | |
| 782 | 1 | |
| 773 | 1 | |
| 764 | 1 | |
| 763 | 1 | |
| 751 | 1 | |
| 745 | 1 | |
| 743 | 1 |
EDAD
Real number (ℝ≥0)
| Distinct | 1217 |
|---|---|
| Distinct (%) | 97.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 12463.42139 |
| Minimum | 56 |
|---|---|
| Maximum | 37747 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 9.9 KiB |
Quantile statistics
| Minimum | 56 |
|---|---|
| 5-th percentile | 2189.6 |
| Q1 | 5666 |
| median | 12121 |
| Q3 | 17319 |
| 95-th percentile | 25879.2 |
| Maximum | 37747 |
| Range | 37691 |
| Interquartile range (IQR) | 11653 |
Descriptive statistics
| Standard deviation | 7737.418553 |
|---|---|
| Coefficient of variation (CV) | 0.6208101541 |
| Kurtosis | 0.01706119907 |
| Mean | 12463.42139 |
| Median Absolute Deviation (MAD) | 5615 |
| Skewness | 0.5938021081 |
| Sum | 15616667 |
| Variance | 59867645.87 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 22604 | 3 | 0.2% |
| 7633 | 2 | 0.2% |
| 7047 | 2 | 0.2% |
| 11599 | 2 | 0.2% |
| 4503 | 2 | 0.2% |
| 2250 | 2 | 0.2% |
| 3659 | 2 | 0.2% |
| 2844 | 2 | 0.2% |
| 15014 | 2 | 0.2% |
| 1941 | 2 | 0.2% |
| Other values (1207) | 1232 |
| Value | Count | Frequency (%) |
| 56 | 1 | |
| 59 | 1 | |
| 61 | 1 | |
| 83 | 1 | |
| 100 | 1 | |
| 112 | 1 | |
| 116 | 1 | |
| 139 | 2 | |
| 143 | 1 | |
| 178 | 1 |
| Value | Count | Frequency (%) |
| 37747 | 1 | |
| 37577 | 1 | |
| 37328 | 1 | |
| 37226 | 1 | |
| 37009 | 1 | |
| 36636 | 1 | |
| 36304 | 1 | |
| 35322 | 1 | |
| 35206 | 1 | |
| 35124 | 1 |
DIABETES
Real number (ℝ≥0)
| Distinct | 709 |
|---|---|
| Distinct (%) | 56.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 436.6855547 |
| Minimum | 1 |
|---|---|
| Maximum | 1868 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 9.9 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 61 |
| Q1 | 187 |
| median | 391 |
| Q3 | 621 |
| 95-th percentile | 1017.2 |
| Maximum | 1868 |
| Range | 1867 |
| Interquartile range (IQR) | 434 |
Descriptive statistics
| Standard deviation | 307.1605377 |
|---|---|
| Coefficient of variation (CV) | 0.7033906536 |
| Kurtosis | 0.8220393278 |
| Mean | 436.6855547 |
| Median Absolute Deviation (MAD) | 213 |
| Skewness | 0.9124112776 |
| Sum | 547167 |
| Variance | 94347.59593 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 78 | 6 | 0.5% |
| 61 | 6 | 0.5% |
| 265 | 6 | 0.5% |
| 108 | 6 | 0.5% |
| 86 | 5 | 0.4% |
| 291 | 5 | 0.4% |
| 73 | 5 | 0.4% |
| 499 | 5 | 0.4% |
| 546 | 5 | 0.4% |
| 564 | 5 | 0.4% |
| Other values (699) | 1199 |
| Value | Count | Frequency (%) |
| 1 | 4 | |
| 3 | 4 | |
| 4 | 3 | |
| 6 | 2 | |
| 8 | 2 | |
| 10 | 2 | |
| 11 | 2 | |
| 12 | 2 | |
| 17 | 1 | 0.1% |
| 21 | 2 |
| Value | Count | Frequency (%) |
| 1868 | 1 | |
| 1756 | 1 | |
| 1724 | 1 | |
| 1589 | 1 | |
| 1447 | 1 | |
| 1442 | 1 | |
| 1434 | 1 | |
| 1404 | 1 | |
| 1403 | 1 | |
| 1383 | 1 |
EPOC
Real number (ℝ≥0)
| Distinct | 757 |
|---|---|
| Distinct (%) | 60.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 502.0247406 |
| Minimum | 1 |
|---|---|
| Maximum | 2030 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 9.9 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 72 |
| Q1 | 225 |
| median | 462 |
| Q3 | 715 |
| 95-th percentile | 1132.2 |
| Maximum | 2030 |
| Range | 2029 |
| Interquartile range (IQR) | 490 |
Descriptive statistics
| Standard deviation | 344.4080121 |
|---|---|
| Coefficient of variation (CV) | 0.6860379264 |
| Kurtosis | 0.730253287 |
| Mean | 502.0247406 |
| Median Absolute Deviation (MAD) | 246 |
| Skewness | 0.8632385163 |
| Sum | 629037 |
| Variance | 118616.8788 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 99 | 6 | 0.5% |
| 121 | 6 | 0.5% |
| 114 | 6 | 0.5% |
| 327 | 5 | 0.4% |
| 94 | 5 | 0.4% |
| 676 | 5 | 0.4% |
| 120 | 5 | 0.4% |
| 243 | 5 | 0.4% |
| 300 | 5 | 0.4% |
| 140 | 5 | 0.4% |
| Other values (747) | 1200 |
| Value | Count | Frequency (%) |
| 1 | 1 | 0.1% |
| 2 | 3 | |
| 3 | 3 | |
| 4 | 3 | |
| 5 | 1 | 0.1% |
| 8 | 2 | |
| 10 | 1 | 0.1% |
| 11 | 1 | 0.1% |
| 12 | 2 | |
| 13 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 2030 | 1 | |
| 2004 | 1 | |
| 1947 | 1 | |
| 1716 | 1 | |
| 1684 | 1 | |
| 1644 | 1 | |
| 1632 | 1 | |
| 1613 | 1 | |
| 1566 | 1 | |
| 1550 | 1 |
ASMA
Real number (ℝ≥0)
| Distinct | 755 |
|---|---|
| Distinct (%) | 60.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 503.9225858 |
| Minimum | 1 |
|---|---|
| Maximum | 2035 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 9.9 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 73.6 |
| Q1 | 222 |
| median | 460 |
| Q3 | 714 |
| 95-th percentile | 1144.6 |
| Maximum | 2035 |
| Range | 2034 |
| Interquartile range (IQR) | 492 |
Descriptive statistics
| Standard deviation | 346.2178504 |
|---|---|
| Coefficient of variation (CV) | 0.6870457093 |
| Kurtosis | 0.7970130385 |
| Mean | 503.9225858 |
| Median Absolute Deviation (MAD) | 246 |
| Skewness | 0.888872956 |
| Sum | 631415 |
| Variance | 119866.7999 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 4 | 6 | 0.5% |
| 95 | 6 | 0.5% |
| 81 | 6 | 0.5% |
| 212 | 5 | 0.4% |
| 563 | 5 | 0.4% |
| 196 | 5 | 0.4% |
| 272 | 5 | 0.4% |
| 558 | 5 | 0.4% |
| 100 | 5 | 0.4% |
| 419 | 5 | 0.4% |
| Other values (745) | 1200 |
| Value | Count | Frequency (%) |
| 1 | 1 | 0.1% |
| 2 | 3 | |
| 4 | 6 | |
| 6 | 1 | 0.1% |
| 8 | 2 | 0.2% |
| 10 | 1 | 0.1% |
| 12 | 3 | |
| 14 | 2 | 0.2% |
| 18 | 2 | 0.2% |
| 21 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 2035 | 1 | |
| 2023 | 1 | |
| 1960 | 1 | |
| 1762 | 1 | |
| 1727 | 1 | |
| 1702 | 1 | |
| 1565 | 1 | |
| 1553 | 1 | |
| 1547 | 1 | |
| 1545 | 1 |
INMUSUPR
Real number (ℝ≥0)
| Distinct | 738 |
|---|---|
| Distinct (%) | 58.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 506.0925778 |
| Minimum | 2 |
|---|---|
| Maximum | 2042 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 9.9 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 74 |
| Q1 | 224 |
| median | 462 |
| Q3 | 718 |
| 95-th percentile | 1157.8 |
| Maximum | 2042 |
| Range | 2040 |
| Interquartile range (IQR) | 494 |
Descriptive statistics
| Standard deviation | 345.9558265 |
|---|---|
| Coefficient of variation (CV) | 0.6835820988 |
| Kurtosis | 0.7252845166 |
| Mean | 506.0925778 |
| Median Absolute Deviation (MAD) | 248 |
| Skewness | 0.864915738 |
| Sum | 634134 |
| Variance | 119685.4339 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 74 | 7 | 0.6% |
| 97 | 6 | 0.5% |
| 489 | 5 | 0.4% |
| 830 | 5 | 0.4% |
| 92 | 5 | 0.4% |
| 388 | 5 | 0.4% |
| 122 | 5 | 0.4% |
| 114 | 5 | 0.4% |
| 80 | 5 | 0.4% |
| 214 | 5 | 0.4% |
| Other values (728) | 1200 |
| Value | Count | Frequency (%) |
| 2 | 4 | |
| 3 | 1 | 0.1% |
| 4 | 5 | |
| 6 | 1 | 0.1% |
| 8 | 2 | 0.2% |
| 9 | 1 | 0.1% |
| 12 | 3 | |
| 13 | 1 | 0.1% |
| 14 | 1 | 0.1% |
| 17 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 2042 | 1 | |
| 2018 | 1 | |
| 1959 | 1 | |
| 1698 | 1 | |
| 1662 | 1 | |
| 1644 | 1 | |
| 1621 | 1 | |
| 1614 | 1 | |
| 1573 | 1 | |
| 1560 | 1 |
HIPERTENSION
Real number (ℝ≥0)
| Distinct | 688 |
|---|---|
| Distinct (%) | 54.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 416.2585794 |
| Minimum | 1 |
|---|---|
| Maximum | 1850 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 9.9 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 55 |
| Q1 | 177 |
| median | 371 |
| Q3 | 598 |
| 95-th percentile | 972.4 |
| Maximum | 1850 |
| Range | 1849 |
| Interquartile range (IQR) | 421 |
Descriptive statistics
| Standard deviation | 295.9029208 |
|---|---|
| Coefficient of variation (CV) | 0.710863236 |
| Kurtosis | 1.127985552 |
| Mean | 416.2585794 |
| Median Absolute Deviation (MAD) | 205 |
| Skewness | 0.9833046605 |
| Sum | 521572 |
| Variance | 87558.53851 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 77 | 8 | 0.6% |
| 121 | 7 | 0.6% |
| 392 | 6 | 0.5% |
| 119 | 6 | 0.5% |
| 411 | 6 | 0.5% |
| 386 | 5 | 0.4% |
| 200 | 5 | 0.4% |
| 598 | 5 | 0.4% |
| 475 | 5 | 0.4% |
| 50 | 5 | 0.4% |
| Other values (678) | 1195 |
| Value | Count | Frequency (%) |
| 1 | 3 | |
| 2 | 4 | |
| 3 | 2 | |
| 4 | 2 | |
| 6 | 1 | 0.1% |
| 8 | 2 | |
| 9 | 2 | |
| 10 | 1 | 0.1% |
| 11 | 1 | 0.1% |
| 12 | 2 |
| Value | Count | Frequency (%) |
| 1850 | 1 | |
| 1732 | 1 | |
| 1714 | 1 | |
| 1575 | 1 | |
| 1482 | 1 | |
| 1407 | 1 | |
| 1398 | 1 | |
| 1391 | 1 | |
| 1338 | 1 | |
| 1335 | 1 |
OTRA_COM
Real number (ℝ≥0)
| Distinct | 820 |
|---|---|
| Distinct (%) | 65.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 600.6121309 |
| Minimum | 1 |
|---|---|
| Maximum | 2478 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 9.9 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 74 |
| Q1 | 243 |
| median | 537 |
| Q3 | 860 |
| 95-th percentile | 1357 |
| Maximum | 2478 |
| Range | 2477 |
| Interquartile range (IQR) | 617 |
Descriptive statistics
| Standard deviation | 428.0352277 |
|---|---|
| Coefficient of variation (CV) | 0.7126649724 |
| Kurtosis | 0.9360552387 |
| Mean | 600.6121309 |
| Median Absolute Deviation (MAD) | 306 |
| Skewness | 0.919792499 |
| Sum | 752567 |
| Variance | 183214.1561 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 96 | 7 | 0.6% |
| 231 | 7 | 0.6% |
| 4 | 5 | 0.4% |
| 207 | 5 | 0.4% |
| 654 | 5 | 0.4% |
| 97 | 5 | 0.4% |
| 166 | 5 | 0.4% |
| 121 | 4 | 0.3% |
| 705 | 4 | 0.3% |
| 321 | 4 | 0.3% |
| Other values (810) | 1202 |
| Value | Count | Frequency (%) |
| 1 | 1 | 0.1% |
| 2 | 3 | |
| 4 | 5 | |
| 6 | 1 | 0.1% |
| 7 | 2 | 0.2% |
| 9 | 1 | 0.1% |
| 11 | 1 | 0.1% |
| 12 | 2 | 0.2% |
| 13 | 1 | 0.1% |
| 14 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 2478 | 1 | |
| 2413 | 1 | |
| 2404 | 1 | |
| 2395 | 1 | |
| 2179 | 1 | |
| 2118 | 1 | |
| 2020 | 1 | |
| 2006 | 1 | |
| 2001 | 1 | |
| 1997 | 1 |
CARDIOVASCULAR
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 740 |
|---|---|
| Distinct (%) | 59.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 502.7039106 |
| Minimum | 1 |
|---|---|
| Maximum | 2011 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 9.9 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 73 |
| Q1 | 220 |
| median | 464 |
| Q3 | 711 |
| 95-th percentile | 1142.6 |
| Maximum | 2011 |
| Range | 2010 |
| Interquartile range (IQR) | 491 |
Descriptive statistics
| Standard deviation | 343.3287706 |
|---|---|
| Coefficient of variation (CV) | 0.6829641928 |
| Kurtosis | 0.6194867016 |
| Mean | 502.7039106 |
| Median Absolute Deviation (MAD) | 246 |
| Skewness | 0.8411999569 |
| Sum | 629888 |
| Variance | 117874.6447 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 99 | 8 | 0.6% |
| 94 | 7 | 0.6% |
| 536 | 7 | 0.6% |
| 4 | 6 | 0.5% |
| 572 | 6 | 0.5% |
| 576 | 6 | 0.5% |
| 153 | 6 | 0.5% |
| 259 | 5 | 0.4% |
| 75 | 5 | 0.4% |
| 692 | 5 | 0.4% |
| Other values (730) | 1192 |
| Value | Count | Frequency (%) |
| 1 | 1 | 0.1% |
| 2 | 3 | |
| 4 | 6 | |
| 6 | 1 | 0.1% |
| 7 | 1 | 0.1% |
| 8 | 1 | 0.1% |
| 9 | 1 | 0.1% |
| 10 | 1 | 0.1% |
| 12 | 2 | 0.2% |
| 13 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 2011 | 1 | |
| 1889 | 1 | |
| 1852 | 1 | |
| 1730 | 1 | |
| 1701 | 1 | |
| 1672 | 1 | |
| 1610 | 1 | |
| 1541 | 1 | |
| 1530 | 1 | |
| 1523 | 1 |
OBESIDAD
Real number (ℝ≥0)
| Distinct | 714 |
|---|---|
| Distinct (%) | 57.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 449.8747007 |
| Minimum | 1 |
|---|---|
| Maximum | 1888 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 9.9 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 64 |
| Q1 | 199 |
| median | 404 |
| Q3 | 631 |
| 95-th percentile | 1014.8 |
| Maximum | 1888 |
| Range | 1887 |
| Interquartile range (IQR) | 432 |
Descriptive statistics
| Standard deviation | 311.4409945 |
|---|---|
| Coefficient of variation (CV) | 0.6922838604 |
| Kurtosis | 0.9874545479 |
| Mean | 449.8747007 |
| Median Absolute Deviation (MAD) | 218 |
| Skewness | 0.925003129 |
| Sum | 563693 |
| Variance | 96995.49307 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 389 | 7 | 0.6% |
| 519 | 6 | 0.5% |
| 93 | 6 | 0.5% |
| 465 | 6 | 0.5% |
| 214 | 6 | 0.5% |
| 358 | 6 | 0.5% |
| 68 | 5 | 0.4% |
| 84 | 5 | 0.4% |
| 86 | 5 | 0.4% |
| 124 | 5 | 0.4% |
| Other values (704) | 1196 |
| Value | Count | Frequency (%) |
| 1 | 3 | |
| 2 | 2 | |
| 3 | 3 | |
| 4 | 2 | |
| 5 | 1 | 0.1% |
| 7 | 1 | 0.1% |
| 8 | 1 | 0.1% |
| 9 | 2 | |
| 10 | 2 | |
| 12 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 1888 | 1 | |
| 1851 | 1 | |
| 1760 | 1 | |
| 1722 | 1 | |
| 1503 | 1 | |
| 1447 | 2 | |
| 1439 | 1 | |
| 1434 | 1 | |
| 1426 | 1 | |
| 1417 | 1 |
RENAL_CRONICA
Real number (ℝ≥0)
| Distinct | 757 |
|---|---|
| Distinct (%) | 60.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 495.1851556 |
| Minimum | 1 |
|---|---|
| Maximum | 2082 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 9.9 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 71.6 |
| Q1 | 217 |
| median | 448 |
| Q3 | 705 |
| 95-th percentile | 1135.4 |
| Maximum | 2082 |
| Range | 2081 |
| Interquartile range (IQR) | 488 |
Descriptive statistics
| Standard deviation | 341.103365 |
|---|---|
| Coefficient of variation (CV) | 0.6888400453 |
| Kurtosis | 0.7689435349 |
| Mean | 495.1851556 |
| Median Absolute Deviation (MAD) | 244 |
| Skewness | 0.8822430826 |
| Sum | 620467 |
| Variance | 116351.5056 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 114 | 6 | 0.5% |
| 137 | 6 | 0.5% |
| 554 | 6 | 0.5% |
| 375 | 5 | 0.4% |
| 587 | 5 | 0.4% |
| 96 | 5 | 0.4% |
| 240 | 5 | 0.4% |
| 106 | 5 | 0.4% |
| 478 | 5 | 0.4% |
| 91 | 5 | 0.4% |
| Other values (747) | 1200 |
| Value | Count | Frequency (%) |
| 1 | 1 | 0.1% |
| 2 | 3 | |
| 3 | 1 | 0.1% |
| 4 | 5 | |
| 6 | 1 | 0.1% |
| 8 | 2 | 0.2% |
| 9 | 1 | 0.1% |
| 12 | 4 | |
| 13 | 1 | 0.1% |
| 14 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 2082 | 1 | |
| 2002 | 1 | |
| 1828 | 1 | |
| 1692 | 1 | |
| 1661 | 1 | |
| 1617 | 1 | |
| 1608 | 1 | |
| 1549 | 1 | |
| 1537 | 1 | |
| 1531 | 1 |
TABAQUISMO
Real number (ℝ≥0)
| Distinct | 750 |
|---|---|
| Distinct (%) | 59.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 498.6256983 |
| Minimum | 2 |
|---|---|
| Maximum | 2062 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 9.9 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 73.6 |
| Q1 | 219 |
| median | 458 |
| Q3 | 703 |
| 95-th percentile | 1123.4 |
| Maximum | 2062 |
| Range | 2060 |
| Interquartile range (IQR) | 484 |
Descriptive statistics
| Standard deviation | 341.004698 |
|---|---|
| Coefficient of variation (CV) | 0.6838891359 |
| Kurtosis | 0.8528495173 |
| Mean | 498.6256983 |
| Median Absolute Deviation (MAD) | 242 |
| Skewness | 0.8787988415 |
| Sum | 624778 |
| Variance | 116284.204 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 4 | 6 | 0.5% |
| 373 | 5 | 0.4% |
| 351 | 5 | 0.4% |
| 326 | 5 | 0.4% |
| 91 | 5 | 0.4% |
| 110 | 5 | 0.4% |
| 555 | 5 | 0.4% |
| 518 | 5 | 0.4% |
| 115 | 5 | 0.4% |
| 296 | 5 | 0.4% |
| Other values (740) | 1202 |
| Value | Count | Frequency (%) |
| 2 | 4 | |
| 3 | 1 | 0.1% |
| 4 | 6 | |
| 7 | 1 | 0.1% |
| 8 | 1 | 0.1% |
| 10 | 1 | 0.1% |
| 11 | 2 | 0.2% |
| 12 | 1 | 0.1% |
| 13 | 1 | 0.1% |
| 14 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 2062 | 1 | |
| 1993 | 1 | |
| 1901 | 1 | |
| 1861 | 1 | |
| 1824 | 1 | |
| 1705 | 1 | |
| 1596 | 1 | |
| 1567 | 1 | |
| 1566 | 1 | |
| 1523 | 1 |
CLASIFICACION_FINAL
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 775 |
|---|---|
| Distinct (%) | 61.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 570.1747805 |
| Minimum | 2 |
|---|---|
| Maximum | 1678 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 9.9 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 105.6 |
| Q1 | 269 |
| median | 550 |
| Q3 | 786 |
| 95-th percentile | 1162.6 |
| Maximum | 1678 |
| Range | 1676 |
| Interquartile range (IQR) | 517 |
Descriptive statistics
| Standard deviation | 347.0110815 |
|---|---|
| Coefficient of variation (CV) | 0.608604753 |
| Kurtosis | -0.111881788 |
| Mean | 570.1747805 |
| Median Absolute Deviation (MAD) | 259 |
| Skewness | 0.5533832436 |
| Sum | 714429 |
| Variance | 120416.6907 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 232 | 6 | 0.5% |
| 171 | 6 | 0.5% |
| 6 | 5 | 0.4% |
| 458 | 5 | 0.4% |
| 676 | 5 | 0.4% |
| 552 | 5 | 0.4% |
| 696 | 5 | 0.4% |
| 565 | 5 | 0.4% |
| 613 | 4 | 0.3% |
| 211 | 4 | 0.3% |
| Other values (765) | 1203 |
| Value | Count | Frequency (%) |
| 2 | 2 | 0.2% |
| 3 | 2 | 0.2% |
| 4 | 1 | 0.1% |
| 6 | 5 | |
| 8 | 1 | 0.1% |
| 9 | 1 | 0.1% |
| 12 | 1 | 0.1% |
| 15 | 2 | 0.2% |
| 18 | 2 | 0.2% |
| 19 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 1678 | 1 | |
| 1670 | 1 | |
| 1661 | 1 | |
| 1653 | 1 | |
| 1648 | 1 | |
| 1645 | 1 | |
| 1595 | 1 | |
| 1590 | 1 | |
| 1585 | 1 | |
| 1574 | 1 |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
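The definition above can be sketched in plain Python: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. The function names are illustrative, and ties are given their average rank, which is a common convention assumed here rather than something the report states.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but nonlinear relationship (made-up data for illustration).
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, round(r, 3))
```

On this toy input ρ reaches its maximum because the ordering is preserved, while r stays below 1 because the relationship is not linear.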
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
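As a minimal sketch of that formula, the snippet below computes r for the NEUMONIA and EDAD values of the ten rows listed under "First rows", and then checks the invariance under a change of location and scale. The function name `pearson_r` and the rescaling constants are illustrative choices.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # The 1/n factors cancel between numerator and denominator.
    return cov / math.sqrt(var_x * var_y)

# NEUMONIA and EDAD from the ten rows listed under "First rows".
neumonia = [2, 1, 1, 2, 1, 3, 2, 6, 2, 7]
edad = [116, 83, 56, 139, 61, 178, 112, 345, 139, 404]

r = pearson_r(neumonia, edad)
# Shifting and rescaling one variable should leave r unchanged.
r_rescaled = pearson_r(neumonia, [3 * v + 10 for v in edad])
print(round(r, 3), round(r_rescaled, 3))
```

Ten rows are far too few to reproduce the report's matrices, but the strong positive value is in line with the high-correlation alerts raised for these two columns.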
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
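The pair-counting definition above corresponds to the variant usually called τ-a. A short sketch under that assumption follows; statistical libraries often apply a tie-corrected variant (τ-b) instead, so results on tied data may differ from this version. The function name is illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 means a tie; it counts toward neither group
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Made-up data: two concordant pairs and one discordant pair.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(round(tau, 3))
```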
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion visually.
First rows
| | FECHA_DEF | SEXO | NEUMONIA | EDAD | DIABETES | EPOC | ASMA | INMUSUPR | HIPERTENSION | OTRA_COM | CARDIOVASCULAR | OBESIDAD | RENAL_CRONICA | TABAQUISMO | CLASIFICACION_FINAL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2020-03-18 | Hombre | 2 | 116 | 3 | 4 | 4 | 4 | 3 | 4 | 4 | 3 | 4 | 4 | 6 |
| 1 | 2020-03-18 | Mujer | 1 | 83 | 1 | 1 | 1 | 2 | 1 | 98 | 1 | 2 | 2 | 2 | 2 |
| 2 | 2020-03-20 | Hombre | 1 | 56 | 1 | 2 | 2 | 2 | 2 | 2 | 2 | 1 | 1 | 2 | 3 |
| 3 | 2020-03-22 | Hombre | 2 | 139 | 3 | 3 | 4 | 3 | 2 | 4 | 4 | 4 | 3 | 4 | 4 |
| 4 | 2020-03-23 | Mujer | 1 | 61 | 1 | 2 | 2 | 2 | 1 | 1 | 2 | 1 | 2 | 2 | 3 |
| 5 | 2020-03-24 | Hombre | 3 | 178 | 4 | 5 | 6 | 6 | 4 | 6 | 6 | 5 | 6 | 4 | 8 |
| 6 | 2020-03-25 | Hombre | 2 | 112 | 3 | 3 | 4 | 4 | 2 | 2 | 4 | 3 | 4 | 3 | 6 |
| 7 | 2020-03-26 | Hombre | 6 | 345 | 10 | 12 | 12 | 12 | 9 | 12 | 12 | 9 | 12 | 12 | 18 |
| 8 | 2020-03-26 | Mujer | 2 | 139 | 3 | 4 | 4 | 4 | 2 | 4 | 4 | 2 | 4 | 4 | 6 |
| 9 | 2020-03-27 | Hombre | 7 | 404 | 12 | 13 | 14 | 14 | 14 | 14 | 14 | 13 | 13 | 14 | 20 |
Last rows
| | FECHA_DEF | SEXO | NEUMONIA | EDAD | DIABETES | EPOC | ASMA | INMUSUPR | HIPERTENSION | OTRA_COM | CARDIOVASCULAR | OBESIDAD | RENAL_CRONICA | TABAQUISMO | CLASIFICACION_FINAL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1243 | 2021-12-04 | Hombre | 64 | 3281 | 84 | 102 | 101 | 100 | 83 | 97 | 100 | 93 | 94 | 95 | 153 |
| 1244 | 2021-12-04 | Mujer | 36 | 1857 | 43 | 53 | 56 | 53 | 44 | 53 | 56 | 51 | 54 | 55 | 84 |
| 1245 | 2021-12-05 | Hombre | 75 | 3691 | 87 | 105 | 108 | 110 | 84 | 106 | 105 | 102 | 103 | 102 | 164 |
| 1246 | 2021-12-05 | Mujer | 46 | 2233 | 48 | 66 | 68 | 68 | 43 | 68 | 66 | 59 | 63 | 67 | 102 |
| 1247 | 2021-12-06 | Hombre | 63 | 2979 | 174 | 192 | 191 | 189 | 170 | 188 | 188 | 184 | 185 | 186 | 144 |
| 1248 | 2021-12-06 | Mujer | 52 | 2461 | 59 | 71 | 73 | 74 | 50 | 73 | 73 | 67 | 72 | 71 | 109 |
| 1249 | 2021-12-07 | Hombre | 45 | 2349 | 55 | 68 | 70 | 69 | 51 | 69 | 69 | 65 | 64 | 66 | 105 |
| 1250 | 2021-12-07 | Mujer | 28 | 1418 | 31 | 37 | 40 | 38 | 25 | 132 | 39 | 32 | 37 | 40 | 60 |
| 1251 | 2021-12-08 | Hombre | 8 | 315 | 6 | 8 | 8 | 8 | 6 | 7 | 7 | 8 | 8 | 8 | 12 |
| 1252 | 2021-12-08 | Mujer | 13 | 632 | 11 | 16 | 18 | 17 | 11 | 18 | 18 | 14 | 18 | 18 | 27 |